Regularization and Model Selection with Categorial Predictors and Effect Modifiers in Generalized Linear Models
Authors
Abstract
Varying-coefficient models with categorical effect modifiers are considered within the framework of generalized linear models. We distinguish between nominal and ordinal effect modifiers, and propose suitable Lasso-type regularization techniques that allow for (1) selection of relevant covariates, and (2) identification of coefficient functions that actually vary with the level of a potentially effect-modifying factor. We investigate large-sample properties, and show in simulation studies that the proposed approaches also perform very well for finite samples. In addition, the presented methods are compared with alternative procedures, and applied to real-world medical data.

1. Introduction

In regression modeling, categorical predictors, also called factors, are a standard case. Nevertheless, variable selection for discrete covariates, and the related problem of which categories within a factor should be distinguished, have been somewhat neglected. More concretely, in our application, we model the effects of pregnancy-related covariates on the type of delivery, that is, whether birth was given vaginally or by Cesarean section. Cases were observed over a period of several years. As medical standards typically change over time, modeling the type of delivery requires considering discrete time effects and, more importantly, how effects change over the years.

In general, we are going to address model selection with discrete covariates in a slightly extended version of generalized linear models (GLMs), namely GLMs with varying coefficients. Varying-coefficient models (Hastie and Tibshirani, 1993) are a quite flexible tool to capture complex model structures and interactions. In the setting of GLMs, regression coefficients β_j are allowed to vary with the value of other variables u_j. Hence the linear predictor has the form
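A minimal Python sketch may help make the idea of level-specific coefficients concrete. It assumes, for simplicity, a single categorical effect modifier u shared by all coefficients (the text allows a separate u_j per coefficient), and it uses L1 penalties on coefficient differences as one common form of Lasso-type fusion; the function names and the exact penalty terms are illustrative assumptions, not the authors' definitions.

```python
from itertools import combinations


def linear_predictor(x, u, beta):
    """Linear predictor for one observation (illustrative, single modifier).

    x    : list of p covariate values
    u    : level index (0..k-1) of the categorical effect modifier
    beta : list of p+1 rows; beta[j][r] is coefficient j at level r,
           with row 0 holding the level-specific intercepts
    """
    eta = beta[0][u]  # intercept at level u
    for j, x_j in enumerate(x, start=1):
        eta += x_j * beta[j][u]  # covariate effect at level u
    return eta


def fusion_penalty(beta_j, ordinal):
    """L1 penalty on differences between level-specific coefficients.

    Ordinal modifier: only adjacent levels are compared.
    Nominal modifier: all pairs of levels are compared.
    A value of 0 means the coefficient does not vary with the level.
    """
    if ordinal:
        pairs = zip(beta_j[1:], beta_j[:-1])
    else:
        pairs = combinations(beta_j, 2)
    return sum(abs(a - b) for a, b in pairs)


def selection_penalty(beta_j):
    """L1 penalty on the coefficients themselves (0 means covariate excluded)."""
    return sum(abs(b) for b in beta_j)


# Hypothetical coefficients for p = 2 covariates and k = 3 levels.
beta = [
    [0.5, 0.5, 1.0],  # intercept: differs only at the last level
    [2.0, 2.0, 2.0],  # covariate 1: relevant, but constant over levels
    [0.0, 0.0, 0.0],  # covariate 2: not selected
]
eta = linear_predictor([1.0, 3.0], 2, beta)
```

In this toy setting, a fusion penalty of zero for a row corresponds to a coefficient that is not actually varying, and a selection penalty of zero corresponds to a covariate dropped from the model, which mirrors goals (2) and (1) of the abstract.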
Similar resources
Regularization and Model Selection with Categorial Effect Modifiers
The case of continuous effect modifiers in varying-coefficient models has been well investigated. Categorial effect modifiers, however, have been largely neglected. In this paper, a regularization technique is proposed that allows for selection of covariates and fusion of categories of categorial effect modifiers in a linear model. A distinction is made between nominal and ordinal variables, since...
L1-regularization path algorithm for generalized linear models
We introduce a path following algorithm for L1-regularized generalized linear models. The L1-regularization procedure is useful especially because it, in effect, selects variables according to the amount of penalization on the L1-norm of the coefficients, in a manner that is less greedy than forward selection–backward deletion. The generalized linear model path algorithm efficiently computes so...
Variable selection for functional linear models with functional predictors and a functional response
We consider a variable selection problem for functional linear models where both multiple predictors and a response are functions. In particular, we assume that the variables are given as functions of time, and construct the historical functional linear model, which takes the dependence between the predictors and the response into consideration. Unknown parameters included in the model are est...
Large-scale Inversion of Magnetic Data Using Golub-Kahan Bidiagonalization with Truncated Generalized Cross Validation for Regularization Parameter Estimation
In this paper a fast method for large-scale sparse inversion of magnetic data is considered. The L1-norm stabilizer is used to generate models with sharp and distinct interfaces. To deal with the non-linearity introduced by the L1-norm, a model-space iteratively reweighted least squares algorithm is used. The original model matrix is factorized using the Golub-Kahan bidiagonalization that proje...
Bayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. The beta distribution is a flexible family of densities on the interval (0, 1) that covers symm...